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Searches for gravitational waves from binary neutron stars or sub-solar mass black holes by the LIGO Scien- 
tific Collaboration use the FINDCHIRP algorithm: an implementation of standard matched filter techniques with 
innovations to improve performance on detector data that has non-stationary and non-Gaussian artifacts. We 
provide details on the methods used in the FINDCHIRP algorithm and describe some future improvements. 
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I. INTRODUCTION 

For the detection of a known modulated sinusoidal signal, 
such as the anticipated gravitational waveform from binary 
inspiral, in the presence of stationary and Gaussian noise, it 
is well known that the use of a matched filter is the opti- 
mal detection strategy 1 1]. Some practical complications arise 
for the gravitational wave detection problem because: (i) the 
signal is not precisely known — it is parameterized by the bi- 
nary companion's masses, an initial phase, the time of arrival, 
and various parameters describing the distance and orienta- 
tion of the system relative to the detector that can be com- 
bined into a single parameter we call the "effective distance," 
and (ii) the detector noise is not perfectly described as a sta- 
tionary Gaussian process. Standard techniques for extending 
the simple matched-filter to search over the unknown param- 
eters involve using a quadrature sum of matched filter outputs 
for orthogonal-phase waveforms (thereby eliminating the un- 
known phase), use of Fourier transform to efficiently apply 
the matched filters for different times of arrival, and use of a 
bank of templates to cover the parameter space of binary com- 
panion masses |2j,|3||. Methods for making the matched filter 
more robust against non-Gaussian noise artifacts, e.g., by ex- 
amining the relative contributions of frequency-band-limited 
matched-filter outputs (vetoing those transients that produce 
large matched filter outputs but have a time-frequency de- 
composition that is inconsistent with the expected waveform), 
have also been explored |4]. The FINDCHIRP algorithm is 
an implementation of these well-known methods. Several as- 
pects of the algorithm have been described in passing before 
0,0,0 HI' but here we provide a detailed and comprehensive 
description of our algorithm as used in the LIGO Scientific 
Collaboration search for binary neutron star signals. 

The FINDCHIRP algorithm is the part of the search that (i) 
computes the matched filter response to the interferometer 
data for each template in a bank of templates, (ii) computes a 
chi-squared discriminant (if needed) to reject instrumental ar- 
tifacts that produce large spurious excitations of the matched 
filter but otherwise do not resemble an expected signal, and 
(iii) selects candidate events or triggers based on the matched 
filter and chi-squared outputs. This is a fundamental part of 
the search for binary neutron star signals, but the search also 
consists of several other important steps such as data selection 
and conditioning, template bank generation, rejection of can- 



didate events by vetoes based on auxiliary instrumental chan- 
nels, and multidetector coincidence and coherent follow-up of 
triggers. The entire search pipeline, which is a transformation 
of raw interferometer data into candidate events, contains all 
these aspects. A description of the pipeline used in the search 
for binary neutron stars in the first LIGO science run (SI) is 
described in |6] and the pipeline used in the second LIGO sci- 
ence run (S2) is described in 0,0]. 

This paper is not intended to provide documentation for our 
implementation of the FINDCHIRP algorithm. (This can be 
found in Refs. .) Indeed, some of the notation pre- 

sented in this paper differs from the implementation in the 
LIGO Algorithm Library. Rather, this paper is intended to 
describe the algorithm itself. 



II. NOTATION 

Our conventions for the Fourier transform are as follows. 
For continuous quantities, the forward and inverse Fourier 
transforms are given by 



and 



x(t) 



x(t)e- 2nift dt 



Kf)e 27Tlft df 



(2.1a) 
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respectively, so x(f) is the Fourier transform of x(t). If these 
continuous quantities are discretized so that x[j] = x(jAt) 
where 1/ At is the sampling rate and j = 0, . . . , N — 1 are 
N sample points, then the discretized approximation to the 
forward and inverse Fourier transforms are 



N-i 

x[k] = At ^ x\j]e- 2nijk 



and 
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(2.2a) 



(2.2b) 



fc=0 



where A/ = 1/ (N At) and x[k) is an approximation to the the 
value of the continuous Fourier transform at frequency kAf: 
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x[k] « x(fcA/) for < k < [N/2\ and x[k] « s((Jfe - 
N)Af) for L^V/2J < k < N (negative frequencies). Here 
[a J means the greatest integer less than or equal to a. The 
DC component is k = and, when iV is even, k = N/2 
corresponds to the Nyquist frequency. 

Notice that our convention is to have the Fourier compo- 
nents x[k] normalized so as to have the same units as the 
continuous Fourier transform x(f), i.e., the discretized ver- 
sions of the continuous forward and inverse Fourier trans- 
forms carry the normalization constants At and A/ respec- 
tively. Numerical packages instead compute the discrete 
Fourier transform (DFT): 



y[k] 



N-l 
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x[j\e 



(2.3) 



where the minus sign in the exponential refers to the for- 
ward DFT and the positive sign refers to the reverse 1 DFT. 
The DFT is efficiently implemented via the fast Fourier trans- 
form (FFT) algorithm. Thus, it is important to write the most 
computationally-sensitive equation in the form of Eq. (12. 3> so 
that this computation can be done most efficiently. 

Throughout this paper we will reserve the indices j to be 
a time index (which labels a particular time sample), k to be 
a frequency index (which labels a particular frequency bin), 
m to be an index over a bank of templates, and n to be an 
index over analysis segments. Thus, for example, the quantity 
z m,n[j] will be the jth sample of analysis segment n of the 
matched filter output for the rath template, and z m ^ n [k] will be 
the fcth frequency bin of the Fourier transform of the matched 
filter output for the same template and analysis segment. 



III. WAVEFORM 



where D is the distance from the source, i is the angle be- 
tween the direction to the observer and the angular momen- 



tum axis of the binary system, M. = ii 3 / 5 M 2 / 5 



! / 5 M 



(where M = m\ + 7712 is the total mass of the two compan- 
ions, the reduced mass is fi = m\rn,2/M, and rj — fi/M) 
is the chirp mass, and <fi(t — t coa y, M, p) is the orbital phase 
of the binary (whose evolution also depends on the masses of 
the binary companions) 1 12l ll3ll . Here, i coa i and cj) coa \ are the 
time and phase of the binary coalescence when the waveform 
is terminated, known as the coalescence time and coalescence 
phase. Details about the waveform near this time are uncer- 
tain but are expected to be at a frequency higher than LIGO's 
sensitive band for the systems considered in this paper. We 
define i coa i to be the time at which the gravitational wave 
frequency becomes infinite within the restricted second-post- 
Newtonian formalism. The restricted second-post-Newtonian 
waveform is considered sufficient for use as a detection tem- 
plate for searches for binary neutron star systems. 

The gravitational wave strain induced in a particular detec- 
tor depends on the detector's antenna response to the two po- 
larizations of the gravitational waveform. The induced strain 
on the detector is given by 



h(t) = F+h+(t) + F x h x (t) 



(3.2) 



where F + and F x are the antenna response functions for the 
incident signal; these functions depend on the location of the 
source with respect to the horizon of the detector and on the 
polarization angle fl4ll . They are very nearly constant in time 
over the duration of the short inspiral signal. Thus the strain 
on a particular detector can be written as 



h(ty- 



t -t 



-1/4 



GM \ 

c 2 D eS J \5GM/c 3 / 
x cos[20 o - 20(f - t ; M, /*)] 



(3.3a) 



We assume that a binary inspiral waveform is adequately 
described (for binary neutron star systems and sub-solar mass 
binary black hole systems) by the restricted post-Newtonian 
waveform. The two polarizations of the gravitational wave 
produced by such a system depends on a monotonically- 
increasing frequency and amplitude as the orbit radiates away 
energy and decays; the waveform, often called a chirp wave- 
form, is given by 



1 + COS 2 l ( GM \ ( tcoal - t 



2 {^d) UgX/c 3 ) 
x cos[20 coal - 24>(t - t C03 x;M,(i)], (3.1a) 

(GM\ ( t coal -t ^ 1 



h+(t)= 



x sin[20 coal - 2<j>{t - t coa i; M, p,)] (3.1b) 



We use the term reverse rather than inverse since the inverse DFT would 
include an overall normalization factor of 1/N. 



where 



D, 



off 
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(3.3b) 



is the effective distance of the source, 2 to is the termination 
time (the time at the detector at which the coalescence occurs, 
i.e., the detector time when the gravitational wave frequency 
becomes infinite) and (j>o is the termination phase which is 



2 The effective distance of the source is related to the true distance of the 
source by several geometrical factors that relate the source orientation with 
the detector orientation. Because the location and orientation of the source 
are not likely to be known when filtering data from a single detector, is 
convenient to combine the geometric factors with the true distance to give a 
single observable, the effective distance. For an optimally-oriented source 
(one that is directly overhead and is orbiting in the plane of the sky) the 
effective distance is equal to the true distance; for sub-optimally-oriented 
sources the effective distance is greater than the true distance. The location 
and distance can be estimated using three or more detectors, but we do not 
consider this here. 
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related to the coalescence phase by 

' F x 2 cos l 



2(f>o = 2</> coa i + arctan 



F+ 1 



(3.3c) 



Equation 13 . 3 al gives a waveform that is used as a tem- 
plate for a matched filter. Since FINDCHIRP implements the 
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matched filter via a Fast-Fourier Transform (FFT) correlation, 
it is beneficial to write the Fourier transform of the template 
and implement it directly rather than taking the FFT of the 
time-domain waveform of Eq. J3.3ai . A frequency-domain 
version of the waveform can be obtained via the stationary 
phase approximation 1 1 5 ] : 



Kf) 

where 



5tA 1/2 (GM\ 



24 J \ c 3 J \c 2 D cfi J \ c 
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, (3.4c) 
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(3.4d) 



and W has been written to second post-Newtonian order. 
Second-order post-Newtonian stationary phase waveforms 
will provide acceptable detection templates for binary neutron 
stars and sub-solar mass black holes 1 16]. This template wave- 
form has been expressed in terms of several factors: (1) An 
overall distance factor involving the effective distance, D c g — 
for a template waveform, we are free to choose this effective 
distance to any convenient unit, and in the FINDCHIRP code 
it is chosen to be 1 Mpc. (2) A constant (in frequency) fac- 
tor A\ Mpc{M, p), which has dimensions of (time) -1 / 6 , that 
depends only on the total and reduced masses, M and fj,, of 
the particular system. (3) The factor /~ 7 / 6 which does not 



J 



depend on the system parameters. And (4) a phasing factor in- 
volving the phase \&(/; M, fi) which is both frequency depen- 
dent and dependent on the system's total and reduced masses. 
We will see below that an efficient application of the matched 
filter will make use of this factorization of the stationary phase 
template. 

In order to construct a waveform template we need to know 
how long the binary system will radiate gravitation waves in 
the sensitivity band of LIGO. A true inspiral chirp waveform 
would be essentially infinitely long, but the amount of time 
that the binary system spends radiating gravitational waves 
with a frequency above some low frequency cutoff /i ow is fi- 
nite: the duration of the chirp or chirp time from a given fre- 
quency /i ow is given to second post-Newtonian order by 



T, 



chirp 



5 GM 
256?7 c 3 
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(3.5a) 



where 



( GM 

^low — I ^3 TT/low 



(3.5b) 



High mass systems coalesce much more quickly (from a given 
/low) than low mass systems. A search for low mass systems, 
such as primordial black holes, can require very long wave- 
form templates (of the order of tens of minutes) which can 



result in a significant computational burden. 

There is also a high frequency cutoff of the inspiral wave- 
form. Physically, at some high frequency a binary system 
will terminate its secular inspiral and the orbit will decay on a 
dynamical time-scale, though identifying such a frequency is 
very difficult except in extreme mass ratio limit rj — * 0. In this 
limit, that of a test mass orbiting a Schwarzschild black hole, 
the frequency is known as the innermost stable circular orbit 
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or ISCO. The ISCO gravitational wave frequency is 

/isco = (3 ' 6) 

However, long before obtaining this frequency, the binary 
components will be orbiting with sufficiently hight orbital ve- 
locities that the higher order corrections to the second post- 
Newtonian waveform will become significant. Indeed, away 
from the test mass limit, the meaning of the ISCO becomes 
rather suspect. We regard Eq. d3.6i as an upper limit on the 
frequency that can be regarded as representing an "inspiral" 
waveform — not as the frequency to which we can trust our in- 
spiral waveform templates. With this understanding, we nev- 
ertheless use this as a high frequency cutoff for the inspiral 
template waveforms (should this frequency be less than the 
Nyquist frequency of the data). For low mass binary systems 
(binary neutron stars or sub-solar mass black holes) the second 
post Newtonian template waveforms are expected to be reli- 
able within the sensitive band of LIGO so the precise choice 
of the high frequency cutoff is not important. 

IV. MATCHED FILTER 

The matched filter is the optimal filter for detecting a signal 
in stationary Gaussian noise. Suppose that s(t) is a stationary 
Gaussian noise process with one-sided power spectral S s (/) 
density given by = ±S s (\f\)8(f-f>). Then the 

matched filter output of a data stream s(t) (which now may 
contain a signal in addition to the noise) with a filter template 
h(t) is 

x{ t)^AKeJ^W^le^df. (4.1) 

Notice that the use of a FFT will allow one to search for all 
possible arrival times efficiently. However, the waveforms de- 
scribed above have additional unknown parameters. These 
are (i) the amplitude (or effective distance to the source), (ii) 
the coalescence phase, and (iii) the binary companion masses. 
The amplitude simply sets a scale for the matched filter out- 
put, and is unimportant for matched filter templates (these can 
be normalized). The unknown phase can be searched over 
efficiently by forming the sum in quadrature of the matched 
filter output for one phase (say 20o = 0) and an orthogonal 
phase (say 2</>o = tt/2). An efficient method to do this is to 
form the complex matched filter output 

z(t) = x(t) + iy(t) = 4^°° S{f s^ f \ f) ^ ft df. (4.2) 

(notice the lower bound of the integral is zero); then the 
quantity |z(t)| 2 is the quadrature sum of the two orthogonal 
matched filters. Here, y(t) is the matched filter output for the 

template h 2 ^ 2 <i> -^/2 (/) = h(f)e l7r/2 = ih. 

To search over all the possible binary companion masses it 
is necessary to construct a bank of matched filter templates 
laid out on a m\-ra 2 plane sufficiently finely that any true 



system masses will produce a waveform that is close enough 
to the nearest template. There are well known strategies for 
constructing such a bank For our purposes, we shall 

simply introduce an index m = 0, . . . , Nt — 1 labeling the 
particular waveform template h m (t) in the bank of Nt wave- 
form templates. 

By convention, the waveform templates are constructed for 
systems with an effective distance of D c ^ — 1 Mpc. To con- 
struct a signal-to-noise ratio, a normalization constant for the 
template is computed: 

2 a f°° |felMpc,m(/)| 2 . . 

^- 4 X m 1 ( } 

The quantity a m is a measure of the sensitivity of the instru- 
ment. For s(t) that is purely stationary and Gaussian noise, 

(xrn(t)) = (y m (t)) = and (a&(t)) = (^(t)) = ^ 
while for a detector output that corresponds to a signal at dis- 
tance D e g, S(t) = (Asff/1 Mpc) _1 /l 1M pc,ro(*), (x m (t)) = 

a^/Dcff. Thus the quantity 

f.\ \ z m(t)\ 

Pm{t) = (4-4) 

&m 

is the amplitude signal-to-noise ratio of the (quadrature) 
matched filter. It is highly unlikely to obtain p m 1 for 
purely stationary and Gaussian noise so a detection strategy 
usually involves setting a threshold on p m to identify event 
candidates. For such candidates, an estimate of the effective 
distance to the candidate system is Z? c ff = (a m / p m ) Mpc. 

The goal of the FINDCHIRP algorithm is largely to construct 
the quantity p m (t). 

V. DETECTOR OUTPUT AND CALIBRATION 

LIGO records several interferometer channels. The gravi- 
tational wave channel (the primary channel for searching for 
gravitational waves) is formed from the output of a photo- 
diode at the antisymmetric (or "dark") port of the interferom- 
eter 11711. This output is used as an error signal for a feedback 
loop that is needed to keep various optical cavities in the inter- 
ferometer in resonance or "in-lock." Hence it is often called 
the error signal e(t). The error signal is not an exact measure 
of the differential arm displacements of the interferometer so 
it does not correspond to the gravitational wave strain. Rather 
it is part of a linear feedback loop that controls the position of 
the interferometer mirrors. A gravitational wave strain equiv- 
alent output, called s[t) above, can be obtained from the error 
signal e(t) via a linear filter. This is called calibration. In the 
frequency domain, the process of calibration can be thought 
of as multiplying the error signal by a complex response func- 
tion, R(f): 

s(f)=R(fW). (5.1) 

Details on the calibration of the LIGO interferometers can be 
found in fTlfTl . 

The detector output is not a continuous signal but rather 
a time series of samples of e(t) taken with a sample rate of 
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1/At = 16384 Hz where At is the sampling interval. Thus, 
rather than e(t), the input to FINDCHIRP is a discretely sam- 
pled set of values e[j] = e(i sta rt + jAt) for some large num- 
ber of points. The start of the data sample is at time f start- 
Data from the detector is divided into science segments which 
are time epochs when the instrument was in-lock and exhibit- 
ing normal behavior. However, these science segments are not 
normally processed as a whole but are divided into smaller 
amounts. In this paper we shall call the amount of data pro- 
cessed a data block of duration Tbi oc k- The data block must be 
long enough to form a reliable noise power spectral estimate 
(see below), but not so long as to exhaust a computer's mem- 
ory or to experience significant non-stationary changes in the 
detector noise. 

The number of points in a data block is further subdivided 
into Ng data segments or just segments (not to be confused 
with the science segments described above) of duration T. 
The duration of the segment is always an integer multiple of 
the sample rate At, so the number of points in a segment N 
is an integer. These segments are used to construct an average 
noise power spectrum and to perform the matched filtering. 
The segments are overlapped so that the first segment consists 
of the points e[j] for j = 0, . . . , N — 1, the second consists of 
the points j = A, . . . , A + N — 1 where A is known as the 
stride, and so on until the last segment which consists of the 
points j = {N s - 1) A, ...,(Ns- 1) A + N - 1. Note that 



Tbiock = [(N s -l)A + N]At. 



(5.2) 



We usually choose to overlap the segments by 50% so that the 
stride is A = N/2 (and N is always even) and hence there 
are iVg = 2(Tbi oc k/r) — 1 segments. The values of Tbio C k> 
T, At, and Ns must be commensurate so that these relations 
hold. 

The FINDCHIRP algorithm implements the matched filter by 
a FFT correlation. Thus a discrete Fourier transform of the the 
individual data segments, n, 



N-l 

£ 



G n [k] = Aty e\j — nA]e 



-2nijk/N 



(5.3) 



for n — 0, . . . , Ns — 1 are constructed via an FFT. Here k is 
a frequency index that runs from to N — 1. The k = com- 
ponent represents the DC component (/ = 0) which is purely 
real, the components < k < [(N— 1)/2J are all positive fre- 
quency components corresponding to frequencies kAf where 
Af = l/(NAt), and the components [N/2\ < k < N 
are all negative frequency components corresponding to fre- 
quencies (k — N)Af . If N is even (as it always is for the 
FINDCHIRP algorithm) then there is also a purely real Nyquist 
frequency component k = N/2 corresponding to the fre- 
quency ±NAf/2 = ±1/(2 At). Recall |aj is the greatest 
integer less than or equal to a. Note that because the error 
signal data is real, the discrete Fourier transform of it satis- 
fies e* [k] = e n [N - k]. Thus, the findchirp algorithm only 
stores the frequency components k = 0, . . . , L-^/2j , and these 
can be efficiently computed using a real-to-half-complex for- 
ward FFT (2(J. 



The detector strain for segment n can be computed by cali- 
brating the error signal: 



s n [k] = R[k]e n [k] 



(5.4) 



where R[k] is the complex response function. As before, since 
s n [k] must be the Fourier transform of some real time series, 
only the frequency components k = 0, . . . , [N/2\ need to be 
computed. LIGO is sensitive to strains that are smaller than 
~ 10~ 20 , while the error signal is designed to have typical 
values much closer to unity. Often the FINDCHIRP algorithm 
will require quantities that are essentially squares of the mea- 
sured strain (e.g., the power spectrum described in the next 
section). To avoid floating-point over- or under-flow prob- 
lems, the strain can simply be rescaled by a dynamical range 
factor k: 



R[k] — > nR[k) so s n [k] 



Ik]- 



(5.5) 



Choosing a value of k ~ 10 20 will keep all quantities within 
representable floating point numbers. It is important to keep 
track of the factor n to make sure it cancels out in all of the re- 
sults. Essentially this is achieved by multiplying all quantities 
with "units" of strain by the factor k within the implemen- 
tation of the FINDCHIRP algorithm. Thus, in addition to the 
response function, the signal template must also be scaled by 

K. 

Note that if the FINDCHIRP algorithm is used to analyze 
data that has already been preprocessed into strain data then 
all the equations in the remainder of this paper still hold with 
the understanding that the response function is identically 
unity and the error signal is the strain data. The following 
replacements thus need to be made: nR[k] — > 1 and e — * ns. 



VI. AVERAGE POWER SPECTRUM 

Part of the matched filter involves weighting the data by 
the inverse of the detector's power spectral density. The de- 
tector's power spectrum must be obtained from the detector 
output. The most common method of power spectral estima- 
tion is Welch's method. Welch's method 12111 for obtaining 
the average power spectrum S e of the error signal is: 



i Ns-l 

S ^ = at £ P *.»M- 



(6.1) 



Here 



Pe,n [k] 



2A/ 
W 



N-l 

At ^ e n \j]w\j]e- 2 ^ k / N 

3=0 



(6.2) 



is a normalized periodogram for a single segment n which 
is the modulus-squared of the discrete Fourier transform of 
windowed data. The data window is given by w[j] and W is a 
normalization constant 



JV-l 



W 



N ^ 



(6.3) 



3=0 
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FINDCHIRP allows a variety of possible windows, but a Hariri 
window (see, e.g., j^l) is the default choice used by FIND- 
CHIRP. The power spectrum of the detector strain-equivalent 
noise is related to this by S s [k] — \nR[k]\ 2 S e [k]. We call this 
average power spectrum the mean average power spectrum. 

The problem with using Welch's method for power spectral 
estimation is that for detector noise containing significant ex- 
cursions from "normal" behavior (due to instrumental glitches 
or, perhaps, very strong gravitational wave signals), the mean 
used in Eq. can be significantly biased by the excursion. 
An alternative that is pursued in the FINDCHIRP algorithm is 
to replace the mean in Eq. J6. It by a median, which is a more 
robust estimator of the average power spectrum: 



S e [k] 



1 x median{P e)0 [A;],Pe,iM, ■ • • ,P t 



e,N s - 



-i[k]}, 
(6.4) 

where a is a required correction factor. When a = 1, the ex- 
pectation value of the median is not equal to the expectation 
value of the mean in the case of Gaussian noise; hence the 
factor a is introduced to ensure that the same power spectrum 
results for Gaussian noise. In Ref . 1 23 ] and in AppendixlBlit is 
shown that if the set {P e .o[fc], P e .i [k], ■ ■ ■ , P e ,jv s -i[^]} are in- 
dependent exponentially-distributed random variables (as ex- 
pected for Gaussian noise) then 



N S 

£ 

n=l 



-1) 



n+1 



(odd N s ) 



(6.5) 



is the correction factor. We call this median estimate of the 
average power spectrum, corrected by the factor a, the median 
average spectrum. 



Unfortunately this result is not exactly correct either. Be- 
cause the segments used to form the individual sample values 
Pe,n [k] of the power at a given frequency are somewhat over- 
lapping (unless A > N), they are not independent random 
variables (as was assumed in Appendix 151. (This is some- 
what mitigated by the windowing of the segments of data.) 
Although the effect is not large, and simply amounts to a 
slight scaling of what is meant by signal-to-noise ratio, we 
are led to propose a variant of the median method in which 
the n = 0, . . . , iVg — 1 overlapping segments are divided into 
even segments (for which n is even) and the odd segments 
(for which n is odd). If the stride is A > N/2 then no two 
even segments will depend on the same data so the even seg- 
ments will be independent; similarly the odd segments will 
be independent. The average power spectrum can be esti- 
mated by taking the mean of the median power spectrum of 
the ^3/2 even segments and the median power spectrum of 
the ATg/2 odd segments, each of which are corrected by a fac- 
tor a appropriate for the sample median with Ng/2 samples. 
We call this the median-mean average spectrum. Like the me- 
dian spectrum it is not overly sensitive to a single glitch (or 
strong gravitational wave signal). 

The FINDCHIRP algorithm can compute the mean average 
spectrum, the median average spectrum, or the median-mean 
average spectrum. Traditionally the median average spectrum 
has been used though we expect that the median-mean average 
spectrum will be adopted in the future. 

VII. DISCRETE MATCHED FILTER 



The discretized version of Eq. ( 14. 21 is simply: 



r 1 _ i \ f ^ V" J KS »M K ^L Mpc.mW 2irijk/N A \ -f J Kfl [ fc ] e "[ fc 1 K ^Mpc,mW 2vijk/N n , , 



Element j of z n ^ m [j] corresponds to the matched filter output for time t = t start + (nA + j) At where i s t a rt is the start time of the 
block of data analyzed. Note that the sum is over the positive frequencies only, and DC and Nyquist frequencies are excluded in 
FINDCHIRP. (The interferometer is AC coupled so it has no sensitivity at the DC component; similarly, the instrument has very 
little sensitivity at the Nyquist frequency so rejecting this frequency bin has very little effect.) This inverse Fourier transform 
can be performed by the complex reverse FFT (as opposed to a half-complex-to-real reverse FFT) of the quantity 



'0 k < k 



low 



l(N- 1)/2J <k<N. 



I.e., the DC, Nyquist, and negative frequency components are 
all set to zero, as are all frequencies below some low fre- 
quency cutoff /i ow = fci ow A/ (which should be chosen to 
some frequency lower than the detector's sensitive band). The 
low frequency cutoff limits the duration of the inspiral tem- 
plate as described below. 



Our task is to obtain an efficient decomposition of the fac- 
tors making up z„ jm [fc]. Note that there needs to be one re- 
verse FFT performed per segment per template. It desirable 
that this (unavoidable) computational cost dominate the eval- 
uation of the matched filter, so we wish to make the computa- 
tion cost of the calculation of z n ^ m [k] for all k to be less than 
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the computation cost of a FFT. We will consider this in the 
next section. 

One subtlety in the construction of the matched filter is the 
issue of filter wrap-around. The matched filter of Eq. i7.1\ can 
be thought of as digital correlation of a filter hi Mpc.mL?'] w i m 
some suitably over- whitened data stream (the data divided by 
the noise power spectrum). Although hi Mpc.m [k] is generated 
in the Frequency-domain via the stationary phase approxima- 
tion, we can imagine that it came from a time-domain signal 
hiM P c,m[j] of duration T c hirp,m that is given by Eq. ( I3.5a> 
for the low frequency cutoff /i ow . By convention of template 
generation, the coalescence is taken to occur with to corre- 
sponding to j = 0. Thus, then entire chirp waveform is 
non-zero only from t = to — T c hirp,m to t = t . Because 
the discrete Fourier transform presumes that the data is pe- 
riodic, this is represented by having the chirp begin at point 
j — N — Tchirp.m/Ai and end at point j = N — 1. Thus 
hiM pc ,m[j] = for j = 0, . . . , AT — 1 — T chirp . m /At. The 
correlation of hi Mpc, m[j] with the interferometer data will 
involve multiplying the T c hi rp .m/Ai points of data before a 
given time with the T c hir P ,m/A< points of the chirp. When 
this is performed by the FFT correlation, this means that the 
first T c hir P ,m/Ai points of the matched filter output involve 
data at times before the start of the segment, which are in- 
terpreted as the data values at the end of the segment (since 
the FFT assumes that the data is periodic). Hence the first 
Tchirp, m/ At points are of the correlation are invalid and must 
be discarded. That is, of the N points of z n m [j] in Eq. ( 17. 11 1, 
only the points j — Xchj rPiTO /Ai, . . . , N — 1 are valid. Re- 
call that the analysis segments of data are overlapped by an 
amount N — A: this is to ensure that the matched filter output 
is continuous (except at the very beginning of a data block). 
That is, only points j = T cri irp,m/Ai, . . . , N — 1 of zo jm are 
valid and only points j = T cri i rp .m/Ai, . . . , N — 1 of z\^ m 
are valid, but points j — A, . . . , N — 1 of zo,m[j] corre- 
spond to points j = 0, . . . , N — A — 1 of zi )m [j], and these 
can be used instead. Therefore FINDCHIRP must ensure that 
the amount that the data segments overlap, N — A points, is 
greater-than or equal-to the duration, T c hi rp ,m/At points, of 
the filter: T chirP:m /Ai < N - A. 

The quantity that needs to be computed in Eq. ( 17. U is 
more than just a correlation of the data e n [j] with the filter 
hi Mpc,m [j] '■ h also involves a convolution of the data with the 
response function and the inverse of the power spectrum. The 
interferometer has a relatively short impulse response, so this 
convolution will only corrupt a short amount of data (though 
now at the end as well as at the beginning of a analysis seg- 
ment). However, the inverse of the power spectrum has many 
very narrow line features that act as sharp notch-filters when 
applied to the data. These filters have an impulse response that 
is as long as the reciprocal of the resolution of the frequency 
series, which is set by the amount of data used to compute the 
periodograms that are used to obtain the average spectrum. 
Since this is the same duration as the analysis segment dura- 
tion, the convolution of the data with the inverse power spec- 
trum corrupts the entire matched filter output. 

To resolve this problem we apply a procedure to coarse- 
grain the inverse power spectrum called inverse spectrum 



truncation. Our goal is to limit the amount of the matched fil- 
ter that is corrupted due to the convolution of the data with the 
inverse power spectrum. To do this we will obtain the time- 
domain version of the frequency-domain quantity 5 , ~ 1 [A:], 
truncate it so that it has finite duration, and then find the quan- 
tity Q e [k] corresponding to this truncated time-domain filter. 
Note that S , ( 7 1 [A:] is real and non-negative, and we want Q e [k] 
to share these properties. First, construct the quantity: 

JV-l 

q[j] =AfJ2 VysMe- 27TlJk/N (7-3) 

fe=0 

which can be done via a half-complex-to-real reverse FFT. 
Since S e [k] is real and symmetric (S^fc] = S e [N — k]), q[j] 
will be real and symmetric (so that q[j] — q[N — j]). This 
quantity will be non-zero for all N points, though strongly- 
peaked near j = and j = N — 1. Now create a truncated 
quantity ^truncate [j] with a total duration of T spoc (T spcc /2 at 
the beginning and T spcc /2 at the end): 

(q\j\ < j < T spec /2At 
9truncato[j] = < T spcc /2At < j < N - T spcc /2At . 
[q\j] N — T spec /2At < j < N 

(7.4) 

Since ^truncate ls real and symmetric, the discrete Fourier 
transform of (/truncate will also be real and symmetric, though 
not necessarily positive. Therefore we construct: 

Qe[k] = truncate W- ( 7 - 5 ) 

This quantity is real, positive, and symmetric, as desired. Mul- 
tiplying the data by Q e [k] in the frequency domain is equiv- 
alent to convolving the data with qtiuncatcL?] in the time do- 
main twice, which will have the effect of corrupting a duration 
of T spec of the matched filter z 7lim [j) at the beginning and at 
the end of the data segment. This is in addition to the duration 
7chirp,m that is corrupted at the beginning of the data segment 
due to the correlation with the filter hi Mpc,m [j] ■ Thus the total 
duration that is corrupted is 2T spcc + T c hhp.m, and this must 
be less than the time that adjacent segments overlap. The net 
effect of the inverse spectrum truncation is to smear-out sharp 
spectral features and to decrease the resolution of the inverse 
power spectrum weighting. 

For simplicity, we normally choose a 50% overlap (so that 
A = N/2). Of each data segment the middle half with 
j = N/4, . . . , 3 N/4 — 1 is assumed to be valid matched fil- 
ter output. Therefore, the inverse truncation duration T spec 
and the maximum filter duration T c hi rp! r?i must satisfy T spec + 
T c hirp,m < T/4 since a time T spoc + T ch i rp , m is corrupted at 
the beginning of the data segment. 



VIII. WAVEFORM DECOMPOSITION 

Our goal is now to construct the quantity 

z n ,m[k] — 4-: „r,n 2 K R[k]e n [k]Khl Mpcm [k] (8.1) 
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as efficiently as possible. This quantity must be computed 
for every segment n, every template to, and every frequency 
bin in the range k = fc low , . . . , kugh,m - 1 where fei ow = 
L/iow/A/J and fchigh.m is the high-frequency cutoff of the 
waveform template, which is given by the minimum of the 
ISCO frequency of Eq. d3.6i and the Nyquist frequency: 

fchigh,™ - min { L/isco/A/J , l(N + 1)/2J } (8.2) 

(recall that the ISCO frequency depends on the binary sys- 
tem's total mass so it is template-dependent). We can factorize 
z n , m [k] as follows: 

z n , m [k] = 4(A/)- 1 A 1M pc,mi ;i n[fc]exp(-i* m [fc]) (8.3) 

where AiMpc.m is a template normalization (it needs to be 
computed once per template but does not depend on k), $„, [k] 



is a template phase which must be computed at all values of k 
for every template (but does not depend on the data segment), 
and F n [k] is the findchirp data segment that must be computed 
for all values of k for each data segment (but does not depend 
on the template). FINDCHIRP first computes and stores the 
quantities F n [k] for all data segments. Then, for each template 
to in the bank, the phasing \& m [A;] is computed once and then 
applied to all of the data segments (thereby marginalizing the 
cost of the template generation). 

To facilitate the factorization, we rewrite Eq. P.4at in the 
discrete form: 

^lMpc.mW = (A/)- 1 v4iM P c, m fc~ 7/6 exp(i* m [fc]) (8.4a) 
with 



A\ Mpc 



1/2 (GMq/c 2 \ (GM, 



\ 1 Mpc / \ 



-1/6 



1/2 



M 



1/3 



(8.4b) 



* m [fc] =-tt/4- 



128?7 



,-5 



[k] 



3715 



V 756 9 



+ "7T 7 ? v - 



\k] - l^v- 2 [k] 



15 293 365 27145 



508 032 



504 



3085 



l [k] 



(8-4c) 



and 



V m [k] 



GMr. 



nAf 



1/3 



M 



1/3 



fc 1/3 . (8.4d) 



The dependence on the data segment is wholly contained in 
the template-independent quantity F n [k] which is 



(8.5) 



As mentioned earlier, FINDCHIRP computes and stores F n [k] 
for all segments only once, and then reuses these pre- 
computed spectra in forming z n ,m[k] according to Eq. d8.3l >. 
The dependence on the template is wholly contained in the 
data-segment- independent quantity G m [k] which is 



G m [k] 



Jexp(-i* r 
10 



Mow ^ k < fchigh,' 

otherwise 



(8.6) 



This quantity is known as the findchirp template. 

The value of a m is also needed in order to normalize 

Zn,m\j]- It is 



^=4A/ J2 



Qe[k}\Kh 1Mp c, m [k}\' 



\nR[k}\ 2 

= ^lMpc,m^ : ' : [^high,m] (8.7) 



k— k\ t 
12 ^2r 



where 



^ [^high,m] 



4 



E 

k—k\ avi 



Qe[k]k 



-7/3 



KR 



(8.8) 



needs to be computed only once per data block (i.e., only once 
per power spectrum) — it does not depend on the particular 
segment within a block or on the particular template in the 
bank, except in the high-frequency cutoff of the template (if it 
is less than the Nyquist frequency). To account for this min- 
imal dependence on the template, the quantity <, 2 [fchigh.m] is 
pre-computed for all values of fchigh- 

The division of the matched filter into the data-segment- 
only quantity F n [k] and the template-only quantity G m [fc] 
means that FINDCHIRP can efficiently compute the matched 
filter, or, rather, a quantity that is proportional to it: 



N-l 

Cm.nbl = J2 F n\k]G, 
fc=0 



We 



2-rrijk/N 



(8.9) 



Notice that ( m ,n [j], which is a complex quantity, can be com- 
puted using a simple un-normalized inverse FFT of the quan- 
tity F n [k]G m [k]. findchirp computes and stores F n [k] for 
each of the TVg segments in the data block and then, for each 
template m in the bank, G m [k] is computed and used to filter 
each of the A*s segments. This means that for each data seg- 
ment and template the computational cost is essentially lim- 
ited to ~ N complex multiplications plus one complex inverse 
FFT. 
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The quantities ( m .n[j] and z m . n [j] are simply related by a 
normalization factor: 

[j] = ^lMpc,roCm,nL?]- (8.10) 

Furthermore, the signal-to-noise ratio is related to Cm,n[j] via 



2 

Pm,n 



|Cm,n[j]| 2 A 2 [fchigh,.. 



(8.11) 



Rather than applying this normalization to construct the 
signal-to-noise ratio, FINDCHIRP instead scales the desired 
signal-to-noise ratio threshold to obtain a normalized 
threshold 



(8.12) 



which can be directly compared to the values |Cm,n \j]\ 2 to de- 
termine if there is a candidate event (when |Cm,n[j]| 2 > C 2 )- 
When an event candidate is located, the value of the signal-to- 
noise ratio can then be recovered for that event along with an 
estimate of the termination time, to — t star t + (nA+j poa k) At 
where j poa k is the point at which |Cm,n[j]| is peaked; the ef- 
fective distance of the candidate, 

D cfi = rr p f\ M P c; (8 - 13) 

|Sm,n [JpcakJ | 

and the termination phase of the candidate, 

20o = arg Cm,n [jpcak] • (8.14) 

IX. THE CHI-SQUARED VETO 

The FINDCHIRP algorithm employs the chi-squared dis- 
criminator of Ref. 0] to distinguish between plausible signal 
candidates and common types of noise artifacts. This method 
is a type of time-frequency decomposition that ensures that 
the matched-filter output has the expected accumulation in 
various frequency bands. (Noise artifacts tend to excite the 
matched filter at the high frequency or the low frequency, but 
seldom produce the same spectrum as an inspiral.) 

For data consisting of pure Gaussian noise, the real and 
imaginary parts of ( m ,n [j] (for a given value of j) are indepen- 
dent Gaussian random variables with zero mean and variance 
^ 2 [fchigh,™]- If there is a signal present at an effective distance 
D eS then {Rs( m . n [j}) = (A m <; 2 [k high . m }/D cff ) cos2</> and 
(ImCm.nL?]) = (A m <; 2 [k high . m )/D eS ) sin 2(j} (at the termi- 
nation time, where cf>o is the termination phase). 

Now consider the contribution to ( m ,n [j] coming from var- 
ious frequency sub-bands: 



(e,m,n[j] 



E 



F n [k]G m [k]e 



2-rrijk/N 



(9.1) 



for £ = 1, . . . ,p. The p sub-bands are defined by the fre- 
quency boundaries {k — fci ow , fci, . . . , k„ = fcjnghnj}, which 
are chosen so that a true signal will contribute an equal amount 



to the total matched filter from each frequency band. This 
means that the values of kp must be chosen so that 



A ke 

— y 

Af ^ 



-7/3 



\R[kW 



<> [^high,m] 

p 



(9.2) 



With this choice of bands and in pure Gaussian noise, the real 
and imaginary parts of Ce,m,n [j] will be independent Gaussian 
random variables with zero mean and variance <j 2 [fchigh.mj/p- 
Furthermore, the real and imaginary parts of Ce,m.n[j] and 
Ci',m,n[j] with I 7^ £' will be independent since Ce,m,n[j] and 
Ci',m,n[j] are constructed from disjoint bands. Also note that 



Cm ,n [j] 



Ce,m,n\j] 



(93) 



The chi-squared statistic is now constructed from 0,m.n[j] 
as follows: 



»,nb1 = E 



1=1 



\Q,m,n[j] ~ Cm,n\j]/P\' 
^ 2 [khigh,m]/p 



(9.4) 



For pure Gaussian noise, \ 2 is chi-squared distributed with 
v = 2p — 2 degrees of freedom. That v = 2p — 2 rather than 
v = 2p results from the fact that the sample mean C m „[j]/p is 
subtracted from each of values of Ci,m,n[j] in the sum. How- 
ever, this subtraction guarantees that, in the presence of a sig- 
nal that exactly matches the template hi Mpcm (up to an ar- 
bitrary amplitude factor and a coalescence phase), the value 
of x 2 is unchanged. Thus, \ 2 is chi-squared distributed with 
v = 2p — 2 degrees of freedom in Gaussian noise with or 
without the presence of an exactly-matched signal. 

If there is a small mismatch between a signal present in 
the data and the template, which would be expected since the 
templates are spaced on a grid and are expected to provide a 
close match but not a perfect match to a true signal, then x 2 
will acquire a small non-central parameter. This is because 
the mismatched signal may not shift the mean value of the 
real parts of {Ce, m ,n} by the same amount (for each £), and 
similarly the mean values of the imaginary parts of {^,m,n} 
may not be shifted by the same amounts. The effect on the 
chi-squared distribution is to introduce a non-central parame- 
ter that is no larger than A max = 2Sa^ n /D^ s where D c s is the 
effective distance of the true signal and S is the mismatch be- 
tween the true signal and the template hi Mpcm, which could 
be as large as the maximum mismatch of the template bank 
that is used |2]. 

Even for small values of 6 (3% is a canonical value), a 
large value of \ 2 can be obtained for gravitational waves 
from nearby binaries. Therefore, one should not adopt a fixed 
threshold on \ 2 l es t very loud binary inspirals be rejected 
by the veto. For a non-central chi-squared distribution with 
v degrees of freedom and a non-central parameter of A, the 
mean of the distribution is v + A while the variance is be- 
tween one- and two-times the mean (the variance equals twice 
the mean when A = and the variance equals the mean for 
A ^> v). Thus it is useful to adopt a threshold on the quantity 
X 2 ' j{y + A), which would be expected to be on the order of 
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unity even for very large signals. The FINDCHIRP algorithm 
adopts a threshold on the related quantity 



(9.5) 



Sometimes the quantity r 2 = \ 2 /p is referred to, rather than 
S, but this quantity does not include the effect of the non- 
central parameter. 



samples j Q - T spcc / At < j < j + (T spcc + T chilp , m ) / At. 
Rather than record triggers for all samples in which the signal- 
to-noise threshold is exceeded while the chi-squared test is 
satisfied, FINDCHIRP has the option of maximizing over a 
chirp: essentially clustering together triggers that lie within 
a time T chilp . m . Algorithmically, whenever |Cm,n[j]| 2 > C* 
and S m „[j] < S*, a trigger is created with a value of p and 
X 2 ■ If this trigger is within a time T c hii P , m after an earlier trig- 
ger with a larger value of the signal-to-noise ratio p, discard 
the current trigger (it is clustered with the previous trigger). 
If this trigger is within a time T c hirp,m after an earlier trigger 
with a smaller signal-to-noise ratio p, discard the earlier trig- 
ger (the previous trigger is clustered with the current trigger). 
The result is a set of remaining triggers that are separated by at 
time of at least T c hi rp ,m. Note that this algorithm depends on 
the order in which the triggers are selected, i.e., a different set 
of triggers may arise if the triggers are examined in inverse 
order of j rather than in order of j. FINDCHIRP applies the 
conditions as j is advanced from j — N / 4 to j — 3N/ 4—1 
(i.e., forward in time). 4 The effect of the maximizing over a 
chirp is to retain any true signal without introducing any sig- 
nificant bias in parameters, e.g., time of arrival (which can be 
demonstrated by simulations), while reducing the number of 
triggers that are produced by noise artifacts. 



X. TRIGGER SELECTION 

The signal-to-noise ratio threshold is the primary parame- 
ter in identifying candidate events or triggers. As we have 
said, the FINDCHIRP algorithm does not directly compute the 
signal-to-noise ratio, but rather the quantity Cm,n\j] given in 
Eq. ( 18. 9> . whose square modulus is then compared to a nor- 
malized threshold given by Eq. (18. 12i . The computational 
cost of the search is essentially the cost of O(N) complex 
multiplications plus 0(N log N) operations to perform the re- 
verse FFT of Eq. (18.91 . and an additional O(N) operations to 
form the square modulus of ( m ,n[j] for all j. In practice, the 
computational cost is dominated by the reverse complex FFT. 

Triggers that exceed the signal-to-noise ratio threshold are 
then subjected to a chi-squared test. However, the construc- 
tion of Xm, n [j] is much more costly than the construction of XI. EXECUTION OF THE FINDCHIRP ALGORITHM 
(m,n[j] simply because p reverse complex FFTs of the form 
given by Eq. ( I9.lt must be performed. 3 The cost of perform- 
ing a chi-squared test is essentially p times as great as the 
cost of performing the matched filter. FINDCHIRP will only 
perform the chi-squared test if a threshold-crossing trigger is 
found. Therefore, if threshold-crossing triggers are rare then 
the cost of the chi-squared test is small compared to the cost 
of the filtering. Note that other methods are currently un- 
der investigation. For example, in an analysis that requires 
triggers to be coincident between two different detectors, the 
chi-squared test can be disabled on a first pass of trigger gen- 
eration on individual detectors and then only applied on the 
triggers that survive the coincidence criteria. 

A true signal in the data is expected to produce a sharp 
peak in the matched filter output at almost exactly the cor- 
rect termination time to (usually within one sample point of 
the correct time in simulations). For sufficiently loud signals, 
however, a signal-to-noise threshold may be crossed for sev- 
eral samples even though the correct termination time will 
have a much greater signal-to-noise ratio than nearby times. 
Non-Gaussian noise artifacts may produce many threshold- 
crossing triggers, often for a duration similar to the duration 
of the inspiral template that is used. In principle, a large im- 
pulse in the detector output at sample jo can cause triggers for 



In this section, we describe the sequence of operations that 
the comprise the FINDCHIRP algorithm and highlight the tun- 
able parameters of each operation. Since we are only describ- 
ing the FINDCHIRP algorithm itself, we assume that a bank of 
templates (M, p) m has already been constructed for a given 
minimal match 6, according to the methods described in 00 
and we are provided with the output error signal of the inter- 
ferometer e[j] and instrument calibration R[k] as described in 
d. 

The initial operation is to divide the detector data into data 
segments e n [j] suitable for analysis, and so the data segment 
duration T, stride length A, and number of data segments Ns 
in a block must be selected. These quantities then define a data 
block length according to Eq. ( I5.2i . A sample rate 1/ At must 
be chosen (which must be less than or equal to the sample 
rate of the detector data acquisition system); the sample rate 
and data segment length define the number of points in a data 
segment N = T / At. As mentioned previously, the lengths 
and sample rate are chosen so that N is an integer power of 
two. 

The first operation is construction of an un-calibrated av- 
erage power spectrum S e using a specified data window w[j] 
and power spectrum estimation method (Welch's method, the 
median method, or the median-mean method). The number of 
periodograms used in the average power spectrum estimate is 



3 If Xm n L?1 ' s om y required for one particular j then there is a more efficient 
way to compute it. However, the FINDCHIRP algorithm does not employ the 
chi-squared test so much as a veto as a part of a constrained maximization 
of signal-to-noise for times in which the chi-squared condition is satisfied. 
Thus, x m n [j] needs to be computed for all j if it is computed at all. 



4 Other methods can also be employed, for example maximizing all triggers 
that are separated in time by less than T c ^i lp m . 
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typically chosen to be equal to the number of data segments, 
although different numbers of periodograms could be chosen. 
An inverse spectrum duration T spcc is then given in order to 
construct the truncated inverse power spectrum Q e [k], accord- 
ing to Eqs. ( !7.3> -( f73l . The calibration is then applied by di- 
viding the quantity Q e [k]by the modulus squared of the scaled 
response function |K_R[fc]| 2 . 

Each input data segment is Fourier transformed and multi- 
plied by the scaled response function to obtain (niR[k])e n [k]. 
The quantity F n [k] described in Eq. d8.5l > can then be con- 
structed. All frequency components of F n [k] below a speci- 
fied low frequency cutoff /i ow are set to zero, as are the DC 
and Nyquist components. The template independent normal- 
ization constants <? 2 [£;high,m] described in Eq. d8.8t i are also 
computed at this point. 

The algorithm now commences a loop over the Ap tem- 
plates in the bank, using the specified signal-to-noise-ratio and 
chi-squared thresholds, and S*, and the method of maxi- 
mizing over triggers. For each template (M, fJ,) m , the find- 
chirp template G m [/c] is computed, according to Eq. d8.6l >. 
The high frequency cutoff fchigh,m for the template is ob- 
tained using Eq. ( 13.61 and used to select the correct value 
of ? 2 [fchi g h.m] for the template. The normalized signal-to- 
noise threshold is then computed for this template according 
toEq. d8~T2l 

An inner loop over the findchirp data segments is then en- 
tered. For each findchirp segment F n [k] and findchirp tem- 
plate G m [fc] the filter output Cm,n is computed according to 
Eq. ( 18.91 1. The trigger selection algorithm described in Sec.PO 
is now used to determine if any triggers should be generated 
for this data segment and template, given the supplied thresh- 
olds and trigger maximization method. If necessary, the chi- 
squared veto is computed at this stage, according to Eq. ( 19. 4> 
and the threshold given in Eq. ( 19.51 . If any triggers are gener- 
ated, the template parameters (M , fi) m are stored, along with 
the termination time to, signal-to-noise ratio, effective dis- 
tance D c ff, termination phase <fio, chi-squared veto parame- 
ters, and the normalization constant er 2 ^ of the trigger. The 
triggers are generated and stored to disk for later stages of the 
analysis pipeline. 

The segment index n is then incremented and the loop over 
the data segments continues. Once all Ag data segments have 
been filtered against the template, the template index m is in- 
cremented and the loop over templates continues until all At 
templates have been filtered against all Ag data segments. 



XII. CONCLUSION 

Profiling of the inspiral search code based on the FIND- 
CHIRP algorithm was performed on a 3 GHz Pentium 4 CPU 
with a 7 data segments of length 256 seconds. The data was 
read from disk, re-sampled from 16 384 Hz to 4096 Hz and fil- 
tered against a bank containing 474 templates using the FFTW 
package 12(111 to perform the discrete Fourier transforms; the 
resulting 1255 triggers were written out to disk. Of the 2909 
seconds of execution time, 1088 seconds were spent perform- 
ing complex FFTs required by the matched filter, and 1600 



seconds performing the chi-squared veto. Of the time taken 
to perform the chi-squared veto, 1244 seconds are spent ex- 
ecuting the again spent doing inverse FFTs. In total, 2300 
seconds of the 2900 are spent doing FFTs, which means that 
the execution of the FINDCHIRP algorithm is FFT dominated, 
as desired. 



In practice, the FINDCHIRP algorithm is only a part of the 
search for gravitational waves from binary inspiral. An inspi- 
ral analysis pipeline typically includes data selection, template 
bank generation, trigger generation using FINDCHIRP, trigger 
coincidence tests between multiple detectors, vetoes based on 
instrumental behavior, coherent combination of the optimal 
filter output from multiple detectors, and finally manual can- 
didate followups. Pipelines vary between specific analyses; a 
description of the pipeline used to search for the coalescence 
of binary neutron stars in the first LIGO science run can be 
found in |6], and a description of the pipeline used in the sec- 
ond LIGO science run to search for binary neutron stars and 
binary black hole MACHOs can be found in Qg]. Although 
the use of the FINDCHIRP algorithm is primarily to generate 
triggers for a single detector, sections of the complex signal- 
to-noise vector Cm,n[j] can be written to disk along with the 
triggers. If the same template (M, fi) m is used to filter the 
data from two or more interferometers, this complex signal- 
to-noise data can be used directly as the input to the optimal 
coherent matched filter for binary inspiral signals 1 24] . 



It is simple to modify the FINDCHIRP algorithm to use re- 
stricted post-Newtonian templates higher then second order 
by adding addition terms to the construction of the findchirp 
template phase in Eq. d8.4cb . It is expected, however, that 
post-Newtonian templates will be inadequate to search for the 
coalescence of higher-mass binary black holes in the sensitive 
band of the LIGO detectors. The motion of the binary will be 
highly relativistic and the perturbative post-Newtonian calcu- 
lations will no longer be valid. There are two main approaches 
for searching for such high mass systems, which we briefly 
mention here. The first approach is to use a detection template 
family (DTF), such as the BCV DTF El^. These tem- 
plates are frequency domain waveforms designed to capture 
the characteristics of non-spinning and spinning high mass 
systems accurately enough for detection in a matched filter 
search that is still computationally accessible. The modifica- 
tions to the FINDCHIRP algorithm to implement the BCV DTF 
are extensive, and beyond the scope of this paper; we refer to 
Izl for further details. The second approach to detecting high 
mass systems is to use time domain templates bases on post- 
Newtonian re-summation techniques, such as the effective one 
body (EOB) 1 28] or Pade approximants 1 29] . In Appendix 1X1 
we describe the modifications necessary to use arbitrary time 
domain waveforms in the FINDCHIRP algorithm. These mod- 
ifications cannot make use of the factorization used in the sta- 
tionary phase templates, but they allow efficient re-use of the 
search code developed and tested for the frequency domain 
post-Newtonian templates. 
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APPENDIX A: ALGORITHM FOR TEMPLATES 
GENERATED IN THE TIME DOMAIN 

The optimization of the FINDCHIRP algorithm described 
above is dependent on the use of frequency domain restricted 
post-Newtonian waveforms as the template. It is a simple mat- 
ter, however, to modify the algorithm (and hence the code 
used to implement the algorithm) so that an arbitrary wave- 
form generated in the time domain h(t) may be used as the 
matched filter template. This allows use of inspiral tem- 
plates such as the effective one body (EOB) l28ll and Pade 
re-summation waveforms [29]. These waveforms are thought 
to have a higher overlap with high mass signals in the sensitive 
band of the LIGO detectors. In this Appendix, we describe 
the modifications necessary to use time domain templates in 

FINDCHIRP. 

We assume that the desired template waveform is generated 
in the form 

hlMpc,m(t) = A m (t - t ) cos [20 o - 2<p m (t - t )] (Al) 

where to and <fio are the termination time and phase, as de- 
scribed in Sec.[in| and A m (t) and <fi m (t) are the particular am- 
plitude and phase evolution for the m-th template in the bank. 
The bank may include parameterization over binary compo- 
nent spins as well as masses. The template waveform is gen- 
erated from the low frequency cut off f\ ow and is normalized 
to the canonical distance of 1 Mpc. Recall the factorization of 
the matched filter output, given by Eq. fl8.3i : 



[k] = 4(A/)- 1 A 1M pc,™K[fc]G m [&] 



(A2) 



Since we are now only provided with the numerical value of 
the waveform as a function of time, we cannot perform the 
same factorization of the waveform as for stationary phase 
templates. Instead, to compute the findchirp data segment 
F n [k], we remove the template dependent amplitude by mak- 
ing the replacement fc~ 7 / 6 — > 1 in Eq. d8 .51 . Similarly, the 
form of Ai Mpc,m is now much simpler, as it becomes just the 
dynamic range scaling factor A\ M pc ,m = K needed to scale 
the waveform to avoid floating point underflow. 

To construct the findchirp template G m [k], we construct a 
segment of length N sample points and populate it with the 
discrete samples of the template waveform hi m P c, m[j)- The 
waveform is sampled at the sampling interval of the matched 
filter At. When we place the waveform in this segment, we 
must ensure that the termination of the waveform is place at 



the sample point j = 0, i.e. to = 0. For example, if the tem- 
plate is a second order post-Newtonian waveform generated 
in the time domain 1 13], then it is typical to end the wave- 
form generation at the time which the frequency evolution of 
the waveform ceases to be monotonically increasing and not 
the time at which the gravitational wave frequency goes to in- 
finity. Thus the last non-zero sample point of the generated 
template may not correspond to the termination time. In prac- 
tice this generally means placing the template near the end of 
the segment with zero padding after the last non-zero sample 
point of the waveform so that if the frequency evolution had 
been continued, the termination time would be the (wrapped- 
around) sample point j = 0. After placing the waveform in 
the segment, we construct the discrete forward Fourier trans- 
form of the waveform, as described by Eq. \23\ and construct 



Gn 



hi Mpc,m[fc] 




k\ow 5: k < fchigh 

otherwise. 



gh,m 



(A3) 



Finally, we construct the normalization constant 



— y 



Q e [k]\h 



1 Mpc,m [ 



\nR[k]\< 



(A4) 



which is now dependent on the template parameters. Once 
we have constructed these quantities we may proceed with 
the FINDCHIRP algorithm described in Sec. I Villi and Sec. ITxl 
to obtain the signal-to-noise ratio and the value of the chi- 
squared veto for the particular template we have chosen. The 
computational operations required per template are increased 
by 0(N log N) for the additional real-to-half-complex for- 
ward FFT to construct hi M pc ,m, and O(N) operations to con- 
struct dL 



APPENDIX B: BIAS IN MEDIAN POWER SPECTRUM 
ESTIMATION 

Here we compute the bias a of the median of a set of 
periodograms relative to the mean of a set of periodograms. 
We assume that the periodograms are obtained from Gaussian 
noise. In this Appendix, let us focus on one frequency bin of 
the periodogram, and for brevity we adopt the symbol x for 
the power in the frequency bin, that is, we define xi — P e ^ [k] 
for I = 1 , . . . , n. (Here n is the number of periodograms in 
being averaged. It is either N$ or A^s/2 depending on the 
choice of method.) Let f(x) be the distribution function for 
x. For Gaussian noise, f{x) — pT 1 e~ x ' lx where 



/i = (x) = / xf(x)dx 



(Bl) 



is the population mean of x. So p = (P e [k]). The population 
median is defined by 



1 f x l/a 



f(x)dx 



(B2) 
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which yields 



Xx/2 = //In 2. 



(B3) 



Thus the bias of the population median is a = In 2. 

The sample mean is unbiased compared to the population 
mean. The sample mean is: 



1 - 

n — ' 



(B4) 



The expected value of x is 



1 n 

n ± — ' 



:r.e) 



(B5) 



e=o 



so x is an unbiased estimator of fj,. 

The sample median, however, does have a bias. The sample 
median is: 



Xmcd = nicdianja^}. 



(B6) 



To compute the bias, we first need to obtain the probability 
distribution for the sample median. 

For simplicity, assume now that n is odd. The probabil- 
ity of the sample median having a value between x med and 
x mc d + dx mco \ is proportional to the probability of one of the 
samples having a value between x mcd and x med +dx med times 
the probability that half of the remaining samples are larger 
than x mc d and the other half are smaller than x med . Thus, the 
probability distribution for x med is given by g(x me d) where 



g(Xmed) dx mcd — 

C[l - Q(x n 



c d)} m Q m (x med )f(x med ) dx mcd . (B7) 



where m = (n — l)/2 is half of the remaining samples after 
one has been selected as the median. Here, Q(x) is the upper- 
tail probability of x, i.e., the probability that a sample exceeds 
the value x: 



/>oo 

Q(x) = / f(x) dx = e~ x >» 

J X 



(B8) 



where the second equality holds for the exponential distribu- 
tion function corresponding to the power in Gaussian noise. 
The normalization factor C is a combinatoric factor which 
arises from the number of ways of selecting a particular sam- 
ple as the median and then choosing half of the remaining 
points to be greater than this value. Thus it has the value 
of n (number of ways to select the median sample) times 
n — 1 = 2m choose (n — l)/2 = m (number of ways of 
choosing half the points to be larger): 



G 



n x 



2m + 
m 



1 



B(m + l,m + 1) 



(B9) 



This factor can also be obtained simply by normalizing the 
probability distribution g(x med ). To do so it is useful to make 



the substitution t = Q(x mcd ) so that dt = f(x med ) dx med 
and 

3(^mcd) dx mcd = g{t) dt = Ct m {\ - t) m dt. (BIO) 

Now that the probability distribution is known, we can com- 
pute the expected value for the median. Note that for the ex- 
ponential probability distribution x med = —fxlnt. Thus 



\Xmcd ) 



=/i X 



B(m + l,m + 1) J 



2m+l 



The bias factor is therefore 



t m (l-t) m lntdt 

(Bll) 

(B12) 



for odd n. This result makes sense: As n — > oo the series 
approaches In 2 which is the bias for the population median. 
However, for n = 1, a = 1, so there is no bias (the median is 
equal to the mean for one sample!). 



APPENDIX C: CHI-SQUARED STATISTIC FOR A 
MISMATCHED SIGNAL 



For simplicity we write the chi-squared statistic in the 
equivalent form [cf. Eq. ( 17. 1H 



2T.1 Ijgjjj - Z[j]/P\ 2 

X UJ <J 2 /p 
1=1 ,r 



(CI) 



where 



z i \j]=4Af ^ mh }^ c[k] e 2 ^ N , (C2) 



k—kp 



S s [k] 



(C3) 



and 



a 2 = 4A/ 



L(W-l)/2j ~ 2 

l«iMpcFJr 



fe=i 



S s [k] 



(C4) 



For brevity we have dropped the indices n and m; the ex- 
plicit dependence on j will also be dropped hereafter. In this 
Appendix we further simplify the notation by adopting nor- 
malized templates u[k] = /iiMpc[fc]/c. In terms of these tem- 
plates we define the inner products 



k—kp 



S s [k] 
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for the p different bands, which are chosen so that (it, u)t 
1/p, and the inner product 



( S ,n) = 5>,^=4A/ ]T 



[iN - 1)/2i 3[k]V[k] 



k=l 



S s [k] 



(C6) 



With this notation, the signal-to-noise ratio is given by p 2 
\(s, it) | 2 and chi-squared statistic is 



\(s,u) e - (s,u)/p\ 2 

1/p 

p 

(s,u)\ 2 +p^2 \(s,u) e 



(C7) 



To see how the chi-squared statistic is affected by a strong 
signal (considerably larger than the noise), suppose that the 
detector output s[j] consists of the gravitational waveform 
av[j] where a specifies the amplitude of the gravitational 
wave. Here v[j] is also a normalized [in terms of the inner 
product of Eq. ( IC6H gravitational waveform that is not exactly 
the same as u[j] . The discrepancy between the two waveforms 
is given by the mismatch: 



S = l-\(v,u) 



(C8) 



The mismatch is the fraction of the signal-to-noise ratio that 
is lost by filtering the true signal av[j] with the template u[j] 



compared to if the template v[j] were used. The chi-squared 
statistic is 



X 2 =-a 2 \(v,u)\ 2 +pa 2 J2\(v,u) e \ 2 . 

p 

<-a 2 \(v,u)\ 2 + pa 2 y^^(v,v)e(u,u)i 



-p 2 + a 2 < 2p 2 5 



(C9) 



where we have used the Schwarz inequality to obtain the sec- 
ond line and the normalization condition (u,u)e — 1/p to 
obtain the third. Thus, the chi-squared statistic is offset by an 
amount that is bounded by twice the squared signal-to-noise 
ratio observed times the mismatch factor. There is no offset 
for a template that perfectly matched the signal waveform. 

It can be shown 0] that in the presence of a signal and 
Gaussian noise that \ 2 has a non-central chi-squared distribu- 
tion 1 30] with v — 2p—2 degrees of freedom and a non-central 
parameter A < 2p 2 S (where now A may possibly be slightly 
greater than the 25 times the measured signal-to-noise ratio 
squared owing to the presence of the noise). This distribution 
has a mean value of v + A and a variance of 2v + 4A. We see 
then that the modified chi-squared statistic 5 of Eq. ( 19. 5> has 
a mean of < 2 and a variance of < (4 or 8)/ (p + p 2 8) (4 when 
p 3> p 2 8 and 8 when p <C p 2 8) for Gaussian noise. Thus we 
would expect to set a threshold on 5 of ~ a few. 
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